This post explores the TidyTuesday 2020-11-10 dataset, which provides mobile and landline subscription indicators per country from 1990 to 2017.

The data has already been cleaned with the script from the TidyTuesday GitHub repository.
I’ll show the cleaning process below without running it (the code is copied from the GitHub link).

Let’s import the data

TidyTuesday datasets can be downloaded using the tidytuesdayR package.
This dataset comes with two objects: a mobile data frame and a landline data frame.

We will be using the following packages:
- tidytuesdayR => to get the data
- tidyverse => for data wrangling and plotting
- countrycode, janitor => only used in the cleaning step, not in this post
- ggthemes => for better-looking ggplots
- gganimate => to animate a ggplot
- gridExtra => to make subplots

# tidytuesdayR package 
if(!require("tidytuesdayR")){install.packages("tidytuesdayR")}
# Either ISO-8601 date or year/week works!
tuesdata <- tidytuesdayR::tt_load('2020-11-10')
## 
##  Downloading file 1 of 2: `mobile.csv`
##  Downloading file 2 of 2: `landline.csv`
mobile <- tuesdata$mobile
landline <- tuesdata$landline

# Other packages 
if(!require("countrycode")){install.packages("countrycode")}
if(!require("tidyverse")){install.packages("tidyverse")}
if(!require("janitor")){install.packages("janitor")}
if(!require("ggthemes")){install.packages("ggthemes")}
if(!require("gganimate")){install.packages("gganimate")}
if(!require("gridExtra")){install.packages("gridExtra")}

library(tidyverse)

Cleaning script

############### copied and not run from https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-11-10/readme.md
## Shown for information

mobile_df <- raw_mobile %>% 
  janitor::clean_names() %>% 
  rename(
    total_pop = 4,
    "gdp_per_cap" = 6,
    "mobile_subs" = 7
  ) %>% 
  filter(year >= 1990) %>% 
  select(-continent) %>% 
  
  mutate(continent = countrycode::countrycode(
    entity,
    origin = "country.name",
    destination = "continent"
  )) %>% 
  filter(!is.na(continent))

landline_df <- raw_landline %>% 
  janitor::clean_names() %>% 
  rename(
    total_pop = 4,
    "gdp_per_cap" = 6,
    "landline_subs" = 7
  ) %>% 
  filter(year >= 1990) %>% 
  select(-continent) %>% 
  mutate(continent = countrycode::countrycode(
    entity,
    origin = "country.name",
    destination = "continent"
  )) %>% 
  filter(!is.na(continent))

mobile_df %>% 
  write_csv("2020/2020-11-10/mobile.csv")

landline_df %>% 
  write_csv("2020/2020-11-10/landline.csv")

Data Exploration

Let’s explore the mobile and landline subscription trends per continent.

library(ggplot2)
library(ggthemes)
library(gridExtra)

# Mobile Data summary 
mobile_mean_plot<-mobile%>% 
  group_by(continent, year)%>% 
  summarise(mobile_subs_mean=mean(mobile_subs, na.rm = T))%>%
  {# to pass the dot 
  # ggplot aes 
  ggplot(.,aes(year,mobile_subs_mean,color=continent, group=continent))+
  # lines
  geom_line()+
  geom_hline(yintercept = 100, color="gray")+
  # aspect 
  scale_y_continuous(breaks = seq(0, max(.$mobile_subs_mean), 20))+
  labs(title = "Mean mobile subscriptions (per 100 persons) per continent")+
  theme_hc()
  }

# Landline Data summary 
landline_mean_plot<- landline%>% 
  group_by(continent, year)%>% 
  summarise(landline_subs_mean=mean(landline_subs, na.rm = T))%>%
  # ggplot
  ggplot(aes(year,landline_subs_mean,color=continent, group=continent))+
  # line
  geom_line()+
  # aspect
  labs(title = "Mean landline subscriptions (per 100 persons) per continent")+
  theme_hc()

# Plot both graphs 
gridExtra::grid.arrange(mobile_mean_plot,landline_mean_plot, ncol=2)

Mobile subscriptions have grown very rapidly on all continents since 1990. Africa was the last continent to start this transition to digital devices.

However, the rise in African mobile users since 2004-2005 is remarkable. In 2000, Africa had 2.54 mobile subscriptions per 100 people while Europe had 40.3 (in other words, Europe was, on average, 15 to 16 times better equipped with mobile devices than Africa). This vast gap became smaller every year: by 2010, Africa already had 55.8 mobile subscriptions per 100 people while Europe had 116 (about twice as many). In 2017, Europe had only 40% more mobile subscriptions per 100 people than Africa.
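The ratios quoted above follow directly from the per-100 figures in the text. A quick check in R (the variable names are only illustrative):

```r
# Mean mobile subscriptions per 100 people, as quoted in the text
africa_2000 <- 2.54
europe_2000 <- 40.3
africa_2010 <- 55.8
europe_2010 <- 116

# How many times more subscriptions per 100 people Europe had than Africa
ratio_2000 <- europe_2000 / africa_2000
ratio_2010 <- europe_2010 / africa_2010

round(ratio_2000, 1)  # 15.9
round(ratio_2010, 1)  # 2.1
```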

It’s also interesting to see that since 2007, European citizens have had, on average, more than one mobile subscription per person. This upward trend seems to have plateaued since 2010 for Europe and the Americas.

Landline subscriptions followed an interesting path. The growth was not as steady as for mobile devices. For most continents, subscriptions grew until the early 2000s and then started to fall. In 2017, Europe was about as equipped with landlines as it was in 1990 (35.5 vs 33.5 per 100 people). The downward trend seems less noticeable for Africa and Asia.

Missing data

Many countries have missing values for some years. The continent that suffers the most from missing data is Oceania, followed by the Americas. In 2015, the percentage of missing values increased for every continent. Overall, Asia seems to be the continent with the most complete data in this dataset.

mobile_missing<-mobile%>% 
  # Summary 
  mutate(mobile_is_na= ifelse(is.na(mobile_subs),"NA value","Non NA value"))%>%
  group_by(year, continent)%>%
  summarise(number_of_rows= n(),
            mobile_is_na= sum(mobile_is_na=="NA value"),
            mobile_is_na_pct= mobile_is_na/number_of_rows)%>%
  # ggplot
  ggplot(aes(year, mobile_is_na_pct,color=continent))+
  # line
  geom_line(position=position_dodge(width = 0.9))+
    # aspect
  scale_x_continuous(breaks = seq(1990, 2017, 2))+
  labs(title = "Percentage of missing mobile subscriptions data")+
  theme_hc()

landline_missing<-landline%>% 
  # Summary data
  filter(!year %in% c(2018:2019))%>% # these years don't have landline data 
  mutate(landline_is_na= ifelse(is.na(landline_subs),"NA value","Non NA value"))%>%
  group_by(year, continent)%>%
  summarise(number_of_rows= n(),
            landline_is_na= sum(landline_is_na=="NA value"),
            landline_is_na_pct= landline_is_na/number_of_rows)%>%
  # ggplot
  ggplot(aes(year, landline_is_na_pct,color=continent))+
  # line
  geom_line(position=position_dodge(width = 0.9))+
  # aspect 
  scale_x_continuous(breaks = seq(1990, 2017, 2))+
  labs(title = "Percentage of missing landline subscriptions data")+
  theme_hc()

# Plot both graphs 
gridExtra::grid.arrange(mobile_missing,landline_missing, ncol=2)
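A more compact way to get the share of missing values per group is to take the mean of is.na(), since TRUE counts as 1 and FALSE as 0. A minimal base R sketch on made-up data (the toy data frame and its values are assumptions, not taken from the dataset):

```r
# Toy data: values are invented purely for illustration
toy <- data.frame(
  continent   = c("Africa", "Africa", "Asia", "Asia", "Asia", "Oceania"),
  mobile_subs = c(NA, 12.5, 80.1, 95.3, NA, NA)
)

# mean() of a logical vector is the proportion of TRUE values,
# so this gives the share of missing values per continent
missing_share <- tapply(is.na(toy$mobile_subs), toy$continent, mean)
missing_share
```

The same idea should carry over to the grouped summaries above by computing mean(is.na(mobile_subs)) inside summarise().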

Evolution per year

Let’s calculate the yearly growth rate of mobile subscriptions for every continent. We will use handy dplyr functions such as lag, together with the power of grouping.

# Calculate growth rates per year
rates<-mobile%>% 
  group_by(year, continent)%>% 
  summarise(mobile_subs_mean=mean(mobile_subs, na.rm = T))%>%
  arrange(continent, year)%>% # lag() relies on rows being ordered by year within each continent
  ungroup()%>%
  group_by(continent)%>%
  mutate(evolution= (mobile_subs_mean-lag(mobile_subs_mean))/lag(mobile_subs_mean))
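To make the lag-based computation concrete, here is a small sketch on invented numbers (the toy_means data frame is hypothetical). Within each continent, lag() returns the previous row's value, so the first year of every group has no predecessor and gets NA:

```r
library(dplyr)

# Toy yearly means per continent (invented values, for illustration only)
toy_means <- data.frame(
  continent        = c("Africa", "Africa", "Africa", "Europe", "Europe", "Europe"),
  year             = c(2000, 2001, 2002, 2000, 2001, 2002),
  mobile_subs_mean = c(2, 4, 6, 40, 50, 60)
)

toy_rates <- toy_means %>%
  arrange(continent, year) %>%   # lag() relies on row order within each group
  group_by(continent) %>%
  mutate(evolution = (mobile_subs_mean - lag(mobile_subs_mean)) / lag(mobile_subs_mean)) %>%
  ungroup()

toy_rates$evolution
# Africa: NA, 1, 0.5 ; Europe: NA, 0.25, 0.2
```

A value of 1 means the mean doubled from one year to the next (a 100% growth rate).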

rates%>%
  # ggplot()
  ggplot(aes(x=year, y=evolution,fill=continent, color=continent))+
  # lines and points 
  geom_point()+
  geom_line(alpha=0.2)+
  # aspect 
  theme_hc()+
  scale_x_continuous(breaks = seq(1990, 2017, 2))+
  labs(title = "Mobile subscriptions growth rate per continent")

As we saw in the previous graph, Africa joined the trend late, but when it did, mobile subscriptions per 100 people more than doubled each year for several years. The other continents had some ups and downs in their yearly growth rates. Since 2010, the five continents have displayed very similar yearly growth in mobile subscriptions.

Animate our last plot

We will use gganimate to make our plot more dynamic.
The growth rate is capped at 250% (some outliers were present).
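Note that setting limits on a continuous ggplot2 scale does not clip values to the limit: by default, out-of-range values are replaced with NA and dropped from the plot. With limits of c(0, 2.5), that removes growth rates above 250% and also any negative growth rates. A minimal base R sketch of that censoring behaviour (the helper name censor_to_limits and the sample values are hypothetical):

```r
# Hypothetical helper mimicking what scale limits do to out-of-range values
censor_to_limits <- function(x, limits = c(0, 2.5)) {
  out_of_range <- !is.na(x) & (x < limits[1] | x > limits[2])
  x[out_of_range] <- NA
  x
}

growth <- c(-0.10, 0.35, 2.50, 4.20)  # invented growth rates
censored <- censor_to_limits(growth)
censored
# NA 0.35 2.50 NA
```

If the aim is only to zoom in, coord_cartesian(ylim = c(0, 2.5)) keeps all the data and restricts the visible range instead.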

library(gganimate)

country_rates<-mobile%>% 
  group_by(year,continent, entity)%>% 
  summarise(mobile_subs_mean=mean(mobile_subs, na.rm = T),
            pop=total_pop)%>%
  arrange(entity)%>%
  ungroup()%>%
  group_by(entity)%>%
  mutate(evolution= (mobile_subs_mean-lag(mobile_subs_mean))/lag(mobile_subs_mean) ,# growth rates 
         evolution=na_if(evolution, Inf) ,# remove inf values 
         year=as.integer(year) # so years act as whole numbers in gganimate 
         )
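The Inf values appear when the previous year's value is 0, because dividing a positive change by zero yields Inf in R; a step from 0 to 0 yields NaN instead. A small base R sketch with invented numbers, using is.finite() as one possible way to blank out both cases at once:

```r
subs <- c(0, 0, 5, 10)               # invented subscriptions per 100 people
previous <- c(NA, head(subs, -1))    # value of the previous year
growth_raw <- (subs - previous) / previous
growth_raw
# NA NaN Inf 1

# Replace every non-finite result (NaN, Inf, -Inf) with NA
growth_clean <- growth_raw
growth_clean[!is.finite(growth_clean)] <- NA
growth_clean
# NA NA NA 1
```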

evol_annimate<-ggplot(country_rates, aes(year, evolution, colour = entity)) +
  # points and lines 
  geom_point(alpha = 1, show.legend = FALSE, aes(size = pop)) +
  geom_line(alpha=0.3,size=2)+
  # aspect 
  scale_y_continuous(limits=c(0,2.5))+ 
  scale_x_continuous(breaks = seq(1990, 2017, 2))+
  scale_size(range = c(2, 12)) +
  theme_hc()+
  theme(legend.position = 'none')+ # placed after theme_hc(), which would otherwise reset the legend position
  # faceting 
  facet_wrap(~continent) +
  # gganimate
  transition_reveal(year) + # to reveal the graph gradually 
  labs(title = 'Year: {frame_along}', # displays the year 
       x = 'Year', y = 'Growth rate of mobile subscriptions')
animate(evol_annimate,height = 800, width =800,renderer = gifski_renderer())

# If you want to save it 
#anim_save(filename="gif.continents.evol.gif", animation = last_animation(),height = 800, width =800)

Showing shadows instead of lines:

evol_annimate<-ggplot(country_rates, aes(year, evolution, colour = continent)) +
  # points and lines 
  geom_point(alpha = 1, show.legend = FALSE, aes(size = pop)) +
  # aspect 
  scale_y_continuous(limits=c(0,2.5))+ 
  scale_x_continuous(breaks = seq(1990, 2017, 2))+
  scale_size(range = c(2, 12)) +
  theme_hc()+
  theme(legend.position = 'none')+ # placed after theme_hc(), which would otherwise reset the legend position
  # faceting 
  facet_wrap(~continent) +
  # gganimate
  transition_time(year) + # move through the years; shadow_mark() below keeps past points visible 
  labs(title = 'Year: {frame_time}', # displays the year 
       x = 'Year', y = 'Growth rate of mobile subscriptions')+
   shadow_mark(past=T, alpha = 0.3, size = 0.9)

animate(evol_annimate,height = 800, width =800,renderer = gifski_renderer())